** Data reading and variable selection from raw data
** 2008 National Family Research of Japan


** 01. Reading data **

cap log close
clear all
set more off
cd /*insert your working directory here*/
use /*insert the path to your data file here*/
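
// Optional check, not part of the original workflow: confirm that the raw
// variables referenced below exist in the loaded file. The list repeats the
// names used in this do-file; the message text is illustrative only.
local rawvars NO R1 R2A R15RLOB R15RLYB R15RLOS R15RLYS R3_1 R14F_02 R14M_02 ///
	R5 R5S1 R5S2 R5S3 R5S8_2 R5S8_3 R5S8_4 R14S1F_01 R14S1M_01
foreach v of local rawvars {
	capture confirm variable `v', exact
	if _rc {
		display as error "raw variable `v' not found"
	}
}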


** 02. Constructing year and country variables **

ge year=2008
lab var year "survey year"

ge country=392
lab var country "ISO country code"
//Japan: 392 (see "ISO Country Codes.pdf")


** 03. ID variables **

ge pid=NO
lab var pid "person id"
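
// Optional check, assuming NO is meant to identify each respondent uniquely
// (the codebook should confirm this). isid exits with an error if pid has
// duplicates or missing values; capture noisily shows that message and lets
// the do-file continue.
capture noisily isid pid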


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=R1
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=R2A
lab var age "age"

ge birthyr=year-age
lab var birthyr "year of birth"
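
// Note: birthyr is approximate. Without the birthday and the interview date,
// year-age can be off by one year. Optional range check on age (the 0-120
// bounds are an assumption, not taken from the codebook):
count if !missing(age) & (age<0 | age>120)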


** 05. Siblings **

ge nbro=R15RLOB+R15RLYB
lab var nbro "number of brothers"
ge nsis=R15RLOS+R15RLYS
lab var nsis "number of sisters"

ge nsibs=nbro+nsis
replace nsibs=999 if nsibs>=999
lab var nsibs "number of siblings"
lab def nsibs 999 "don't know/no answer"
lab val nsibs nsibs

ge birthorder=R15RLOB+R15RLOS+1
replace birthorder=999 if birthorder>=999
lab var birthorder "birth order"
lab def birthorder 999 "don't know/no answer"
lab val birthorder birthorder
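
// Optional check, not part of the original workflow: count cases where a raw
// sibling item carries a large code. This assumes 9999 is the missing code of
// each item, as inferred from the recodes in section 14; check the codebook.
foreach v of varlist R15RLOB R15RLYB R15RLOS R15RLYS {
	count if `v'>=9999 & !missing(`v')
}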


** 06. Own education **

rename R3_1 educ_a 
lab var educ_a "last school attended"

** 07. Parents' education: Father and/or Mother **

rename R14F_02 faeduc_a
lab var faeduc_a "father's last school attended"

rename R14M_02 moeduc_a
lab var moeduc_a "mother's last school attended"


** 08. Own occupation **

rename R5 workstat
lab var workstat "respondent's work status"
rename R5S2 occ
lab var occ "respondent's current occupation"
rename R5S1 work
lab var work "respondent's current type of employment"
rename R5S3 numemp
lab var numemp "number of employees"

rename R5S8_2 firstwork
lab var firstwork "respondent's first job - type of employment"
rename R5S8_3 firstocc
lab var firstocc "respondent's first occupation"
rename R5S8_4 firstnumemp
lab var firstnumemp "respondent's first job - number of employees"


** 09. Parents' occupation **

rename R14S1F_01 faworkstat
lab var faworkstat "father's current employment status"
rename R14S1M_01 moworkstat
lab var moworkstat "mother's current employment status"


** 10. Tabulate the Identified Variables **

log using /*insert the path of your log file here*/, replace text

** Data reading and variable selection from raw data
** 2008 National Family Research of Japan

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nbro nsis birthorder, d

** R's Own Education **
tab1 educ_a

** Parental Education **
tab1 faeduc_a moeduc_a

** R's Own Occupation **
tab1 workstat occ work numemp 

** Parental Occupation **
tab1 faworkstat moworkstat 

log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nbro nsis nsibs birthorder ///
	 educ_a faeduc_a moeduc_a ///
	 workstat occ work numemp ///
	 faworkstat moworkstat


** 12. Save the Data File **

saveold /*insert the path of your output file here*/, replace



** 13. Harmonizing education **
** Own Education **
rename educ_a educ_cat

ge educ_yrs=9 if educ_cat==1
replace educ_yrs=12 if educ_cat==2
replace educ_yrs=14 if educ_cat==3
replace educ_yrs=14 if educ_cat==4
replace educ_yrs=16 if educ_cat==5
replace educ_yrs=20 if educ_cat==6
replace educ_yrs=4 if educ_cat==7
replace educ_yrs=. if educ_cat==9999
lab var educ_yrs "respondent's highest education in years"
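
// Optional cross-check, not part of the original workflow: rebuild the same
// category-to-years mapping with recode and compare it with educ_yrs.
// educ_yrs_chk is a hypothetical scratch variable and is dropped afterwards.
recode educ_cat (1=9) (2=12) (3 4=14) (5=16) (6=20) (7=4) (else=.), gen(educ_yrs_chk)
assert educ_yrs_chk==educ_yrs
drop educ_yrs_chk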

ge educ_ISCED=200 if educ_cat==1
replace educ_ISCED=300 if educ_cat==2
replace educ_ISCED=500 if educ_cat==3
replace educ_ISCED=500 if educ_cat==4
replace educ_ISCED=600 if educ_cat==5
replace educ_ISCED=700 if educ_cat==6
replace educ_ISCED=. if educ_cat==7
replace educ_ISCED=. if educ_cat==9999
lab var educ_ISCED "respondent's highest education in ISCED code"


** Parents' Education **

ge faeduc_flag=1 

rename faeduc_a faeduc_cat
rename moeduc_a maeduc_cat

ge faeduc_yrs=9 if faeduc_cat==1
replace faeduc_yrs=12 if faeduc_cat==2
replace faeduc_yrs=14 if faeduc_cat==3
replace faeduc_yrs=14 if faeduc_cat==4
replace faeduc_yrs=16 if faeduc_cat==5
replace faeduc_yrs=20 if faeduc_cat==6
replace faeduc_yrs=4 if faeduc_cat==7
replace faeduc_yrs=. if faeduc_cat==8 | faeduc_cat==9999
lab var faeduc_yrs "father's education in years"

ge maeduc_yrs=9 if maeduc_cat==1
replace maeduc_yrs=12 if maeduc_cat==2
replace maeduc_yrs=14 if maeduc_cat==3
replace maeduc_yrs=14 if maeduc_cat==4
replace maeduc_yrs=16 if maeduc_cat==5
replace maeduc_yrs=20 if maeduc_cat==6
replace maeduc_yrs=4 if maeduc_cat==7
replace maeduc_yrs=. if maeduc_cat==8 | maeduc_cat==9999
lab var maeduc_yrs "mother's education in years"

ge faeduc_ISCED=200 if faeduc_cat==1
replace faeduc_ISCED=300 if faeduc_cat==2
replace faeduc_ISCED=500 if faeduc_cat==3
replace faeduc_ISCED=500 if faeduc_cat==4
replace faeduc_ISCED=600 if faeduc_cat==5
replace faeduc_ISCED=700 if faeduc_cat==6
replace faeduc_ISCED=. if faeduc_cat==7
replace faeduc_ISCED=. if faeduc_cat==9999
lab var faeduc_ISCED "father's highest education in ISCED code"

ge maeduc_ISCED=200 if maeduc_cat==1
replace maeduc_ISCED=300 if maeduc_cat==2
replace maeduc_ISCED=500 if maeduc_cat==3
replace maeduc_ISCED=500 if maeduc_cat==4
replace maeduc_ISCED=600 if maeduc_cat==5
replace maeduc_ISCED=700 if maeduc_cat==6
replace maeduc_ISCED=. if maeduc_cat==7
replace maeduc_ISCED=. if maeduc_cat==9999
lab var maeduc_ISCED "mother's highest education in ISCED code"
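
// Optional check, not part of the original workflow: cross-tabulate each
// parent's source category against its ISCED code, missing values included,
// so the mapping can be verified by eye.
foreach p in fa ma {
	tab `p'educ_cat `p'educ_ISCED, missing
}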

** 14. Harmonizing siblings **

ge nbro_flag=99
lab var nbro_flag "cutoff of number of brothers"
ge nsis_flag=99
lab var nsis_flag "cutoff of number of sisters"
ge nsibs_flag=99
lab var nsibs_flag "cutoff of total number of siblings"

lab def nsib_flag 99 "no cutoff"
lab val nbro_flag nsis_flag nsibs_flag nsib_flag

//recode missing
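//nbro and nsis are sums of two raw items, so a sum with one missing
//component (e.g. 9999+1) must be caught as well as 9999 and 19998
replace nbro=. if nbro>=9999 & !missing(nbro)
replace nsis=. if nsis>=9999 & !missing(nsis)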
replace nsibs=. if nsibs==999
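
// Optional check, not part of the original workflow: count large codes that
// survive the recodes above. The 999 threshold is an assumption; no real
// sibling count should reach it.
foreach v of varlist nbro nsis nsibs {
	count if `v'>=999 & !missing(`v')
}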

** 15. Tab Education and Sibling Variables **
tab1 sex age birthyr
tab1 educ_cat educ_yrs faeduc_cat faeduc_yrs maeduc_cat maeduc_yrs faeduc_flag 
tab1 nbro nsis nsibs nbro_flag nsis_flag nsibs_flag


** 16. Save the Data File **

saveold /*insert the path of your output file here*/, replace

